Avoiding Degradation in Deep Feed-forward Networks by Phasing out Skip-connections
Abstract
A widely observed phenomenon in deep learning is the degradation problem: increasing the depth of a network leads to a decrease in performance on both test and training data. Novel architectures such as ResNets and Highway networks have addressed this issue by introducing various flavors of skip-connections or gating mechanisms. However, the degradation problem persists in the context of plain feed-forward networks. In this work we propose a simple method to address this issue. The proposed method poses the learning of weights in deep networks as a constrained optimization problem where the presence of skip-connections is penalized by Lagrange multipliers. This allows for skip-connections to be introduced during the early stages of training and subsequently phased out in a principled manner. We demonstrate the benefits of such an approach with experiments on MNIST, Fashion-MNIST, CIFAR-10 and CIFAR-100 where the proposed method is shown to greatly decrease the degradation effect and is often competitive with ResNets.
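The constrained-optimization idea can be made concrete with a short sketch. Nothing below is the authors' implementation: the gated skip-connection h = F(x) + α·x, the squared penalty on α, the per-block multipliers, and all hyper-parameters are assumptions chosen purely for illustration.

```python
# Minimal sketch (not the paper's code): each block keeps a gated skip
# h = F(x) + alpha * x, and the constraint alpha = 0 is enforced with a
# Lagrange multiplier lam that grows by dual ascent while the network
# weights are trained by gradient descent.
import torch
import torch.nn as nn
import torch.nn.functional as F

class GatedResidualBlock(nn.Module):
    def __init__(self, dim):
        super().__init__()
        self.fc = nn.Linear(dim, dim)
        # alpha starts at 1, so the skip-connection is fully present early on
        self.alpha = nn.Parameter(torch.ones(1))

    def forward(self, x):
        return torch.relu(self.fc(x)) + self.alpha * x

class PhaseOutNet(nn.Module):
    def __init__(self, dim, depth):
        super().__init__()
        self.blocks = nn.ModuleList(GatedResidualBlock(dim) for _ in range(depth))
        # one multiplier per block; updated manually, not by the optimizer
        self.lam = torch.zeros(depth)

    def forward(self, x):
        for block in self.blocks:
            x = block(x)
        return x

    def constraint_term(self):
        # Lagrangian penalty sum_i lam_i * alpha_i^2 (squared form is an assumption)
        return sum(l * b.alpha.pow(2).sum() for l, b in zip(self.lam, self.blocks))

net = PhaseOutNet(dim=32, depth=10)
opt = torch.optim.SGD(net.parameters(), lr=0.1)
x, y = torch.randn(8, 32), torch.randn(8, 32)
for step in range(100):
    loss = F.mse_loss(net(x), y) + net.constraint_term()
    opt.zero_grad()
    loss.backward()
    opt.step()
    with torch.no_grad():  # dual ascent: grow lam wherever alpha is still nonzero
        for i, b in enumerate(net.blocks):
            net.lam[i] += 0.01 * b.alpha.pow(2).sum().item()
```

Because the multipliers start at zero and only grow while some α remains nonzero, the skip-connections are free to aid optimization early in training and are progressively priced out, mirroring the phasing-out behaviour the abstract describes.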
Similar resources
Variable Activation Networks: a Simple Method to Train Deep Feed-forward Networks without Skip-connections
Novel architectures such as ResNets have enabled the training of very deep feed-forward networks via the introduction of skip-connections, leading to state-of-the-art results in many applications. Part of the success of ResNets has been attributed to improvements in the conditioning of the optimization problem (e.g., avoiding vanishing and shattered gradients). In this work we propose a simple me...
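This entry is cut off before it describes the method, so the following is only a guess at what a "variable activation" might look like: an activation that interpolates between the identity (near-linear, easy to optimize, much like a skip-connection) and a ReLU, annealed so that the trained network ends up a plain feed-forward one. The interpolation form and the schedule are assumptions, not the paper's definition.

```python
# Hedged sketch of a "variable activation": mix = 1.0 gives the identity,
# mix = 0.0 gives a pure ReLU; mix is annealed to 0 over training.
import torch
import torch.nn as nn

class VariableActivation(nn.Module):
    def __init__(self):
        super().__init__()
        self.mix = 1.0  # annealed externally; not a learned parameter here

    def forward(self, x):
        return self.mix * x + (1.0 - self.mix) * torch.relu(x)

# Hypothetical linear annealing schedule:
act = VariableActivation()
total_steps = 1000
for step in range(total_steps):
    act.mix = max(0.0, 1.0 - 2.0 * step / total_steps)  # reaches 0 halfway through
    _ = act(torch.randn(4))  # stand-in for the host network's forward pass
```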
Beyond Forward Shortcuts: Fully Convolutional Master-Slave Networks (MSNets) with Backward Skip Connections for Semantic Segmentation
Recent deep CNNs contain forward shortcut connections, i.e. skip connections from low to high layers. Reusing features from lower layers, which have higher resolution (location information), helps higher layers recover lost details and mitigates information degradation. However, during inference the lower layers do not know about high-layer features, although they contain contextual high seman...
Highway-LSTM and Recurrent Highway Networks for Speech Recognition
Recently, very deep networks, with as many as hundreds of layers, have shown great success in image classification tasks. One key component that has enabled such deep models is the use of “skip connections”, including either residual or highway connections, to alleviate the vanishing and exploding gradient problems. While these connections have been explored for speech, they have mainly been ex...
Skip Connections Eliminate Singularities
Skip connections made the training of very deep networks possible and have become an indispensable component in a variety of neural architectures. A completely satisfactory explanation for their success remains elusive. Here, we present a novel explanation for the benefits of skip connections in training very deep networks. The difficulty of training deep networks is partly due to the singulari...